This study focuses on embodied agents that can follow natural language instructions to complete complex tasks in a visually-perceived environment. Existing methods rely on a large amount of (instruction, gold trajectory) pairs to learn a good policy. The high data cost and poor sample efficiency prevent the development of versatile agents that are capable of many tasks and can learn new tasks quickly. In this work, we propose a novel method, LLM-Planner, that harnesses the power of large language models (LLMs) such as GPT-3 to do few-shot planning for embodied agents. We further propose a simple but effective way to enhance LLMs with physical grounding to generate plans that are grounded in the current environment. Experiments on the ALFRED dataset show that our method can achieve very competitive few-shot performance, even outperforming several recent baselines that are trained using the full training data, despite using less than 0.5% of paired training data. Existing methods can barely complete any task successfully under the same few-shot setting. Our work opens the door for developing versatile and sample-efficient embodied agents that can quickly learn many tasks.
translated by 谷歌翻译
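Distributed machine learning enables scalability and computational offloading, but requires a significant amount of communication. Communication efficiency in distributed learning settings is therefore an important consideration, especially when the communication is wireless and battery-driven devices are employed. In this paper, we develop a censoring-based heavy ball (CHB) method for distributed learning in a server-worker architecture. Each worker self-censors unless its local gradient is sufficiently different from the previously transmitted one. The significant practical advantages of the HB method for learning problems are well known, but the question of reducing communication has not been addressed. CHB takes advantage of HB smoothing to eliminate the reporting of small changes, and provably achieves a linear convergence rate equivalent to that of the classical HB method for smooth and strongly convex objective functions. The convergence guarantee of CHB is theoretically justified for both convex and nonconvex cases. In addition, we prove that, under some conditions, at least half of all communications can be eliminated without any impact on the convergence rate. Extensive numerical results validate the communication efficiency of CHB on both synthetic and real datasets, for convex, nonconvex, and nondifferentiable cases. Given a target accuracy, CHB can significantly reduce the number of communications compared to existing algorithms, achieving the same accuracy without slowing down the optimization process.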
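The censoring idea in the abstract above can be sketched roughly as follows. This is a minimal illustration under stated assumptions, not the authors' implementation: the scalar quadratic worker objectives, the exact threshold form (squared gradient change compared with `tau` times the squared last iterate change), and every parameter value are hypothetical choices made for the example.

```python
def censored_heavy_ball(a, b, alpha=0.1, beta=0.3, tau=0.0625, iters=300):
    """Minimize sum_i 0.5 * a[i] * (theta - b[i])**2 with censored uploads.

    Worker i uploads its gradient only when it differs enough from the
    gradient it uploaded last; otherwise the server reuses the stale copy.
    The threshold rule used here is an assumption made for illustration.
    """
    theta = theta_prev = 0.0
    last_sent = [0.0] * len(a)  # server-side copy of each worker's last upload
    uploads = 0
    for _ in range(iters):
        step_sq = (theta - theta_prev) ** 2
        for i in range(len(a)):
            grad = a[i] * (theta - b[i])  # local gradient of worker i
            # Censoring test: skip the upload when the gradient barely changed.
            if (grad - last_sent[i]) ** 2 > tau * step_sq:
                last_sent[i] = grad
                uploads += 1
        # Heavy ball update using the (possibly stale) aggregated gradient.
        new_theta = theta - alpha * sum(last_sent) + beta * (theta - theta_prev)
        theta_prev, theta = theta, new_theta
    return theta, uploads


if __name__ == "__main__":
    theta, uploads = censored_heavy_ball([0.05, 0.2, 3.0], [1.0, -1.0, 2.0])
    print(theta, uploads)
```

In this toy setup the minimizer is the `a`-weighted mean of `b`, which is 1.8 for the values shown, and workers with small curvature skip some uploads, so the upload count falls below the 900 that uncensored communication would need.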
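In federated learning (FL), the objective of collaboratively learning a global model through model updates across devices tends to oppose the goal of personalization via local information. In this work, we calibrate this tradeoff in a quantitative manner through a framework based on multi-criterion optimization, which we cast as a constrained program: the objective for a device is its local objective, which it seeks to minimize while satisfying nonlinear constraints that quantify the proximity between the local and the global model. By considering the Lagrangian relaxation of this problem, we develop an algorithm that allows each node to minimize its local component of the Lagrangian through queries to a first-order gradient oracle. The server then executes Lagrange multiplier ascent steps followed by a Lagrange multiplier-weighted averaging step. We call this instantiation of the primal-dual method Federated Learning Beyond Consensus ($\texttt{FedBC}$). Theoretically, we establish that $\texttt{FedBC}$ converges to a first-order stationary point at a rate that matches the state of the art, up to an additional error term that depends on the tolerance parameter arising from the proximity constraints. Overall, the analysis is a novel characterization of primal-dual methods applied to non-convex saddle point problems. Finally, we demonstrate that $\texttt{FedBC}$ balances the global and local model test accuracy metrics across a suite of datasets (Synthetic, MNIST, CIFAR-10, Shakespeare), achieving performance competitive with the state of the art.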
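We study the problem of developing autonomous agents that can follow human instructions to infer and perform a sequence of actions to complete the underlying task. Significant progress has been made in recent years, especially for short-horizon tasks. However, when it comes to long-horizon tasks with extended sequences of actions, an agent can easily ignore some instructions or get stuck in the middle of the long instructions and eventually fail the task. To address this challenge, we propose a model-agnostic milestone-based task tracker (M-TRACK) to guide the agent and monitor its progress. Specifically, we propose a milestone builder that tags the instructions with navigation and interaction milestones, which the agent needs to complete step by step, and a milestone checker that systematically checks the agent's progress on its current milestone and determines when to proceed to the next. On the challenging ALFRED dataset, our M-TRACK leads to notable relative improvements of 33% and 52% in unseen success rate over two competitive base models.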
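Retrieval of a signal from the Fourier transform of its third-order statistics, or bispectrum, arises in a wide range of signal processing problems. Conventional methods do not provide a unique inversion of the bispectrum. In this paper, we present an approach that uniquely recovers signals with finite spectral support (band-limited signals) from at least $3B$ measurements of their bispectrum function (BF), where $B$ is the signal's bandwidth. Our approach also extends to time-limited signals. We propose a two-step trust region algorithm that minimizes a non-convex objective function. First, we approximate the signal with a spectral algorithm. Then, we refine the attained initialization through a sequence of gradient iterations. Numerical experiments suggest that our proposed algorithm is able to estimate band/time-limited signals from their BF for both complete and undersampled observations.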
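For readers unfamiliar with the quantity being inverted, the sketch below computes the bispectrum of a short discrete signal using the common definition B(k1, k2) = X(k1) X(k2) conj(X(k1 + k2)), where X is the discrete Fourier transform. It only illustrates the forward map; it is not the recovery algorithm from the abstract, and the function names and the circular (modulo N) frequency indexing are assumptions made for the example.

```python
import cmath


def dft(x):
    """Plain O(N^2) discrete Fourier transform of a real or complex list."""
    n = len(x)
    return [
        sum(x[t] * cmath.exp(-2j * cmath.pi * k * t / n) for t in range(n))
        for k in range(n)
    ]


def bispectrum(x):
    """Return B[k1][k2] = X[k1] * X[k2] * conj(X[(k1 + k2) mod N])."""
    n = len(x)
    X = dft(x)
    return [
        [X[k1] * X[k2] * X[(k1 + k2) % n].conjugate() for k2 in range(n)]
        for k1 in range(n)
    ]


if __name__ == "__main__":
    signal = [1.0, 2.0, 0.5, -1.0, 0.0, 3.0]
    shifted = signal[2:] + signal[:2]  # circular shift by two samples
    B, B_shifted = bispectrum(signal), bispectrum(shifted)
    # The bispectrum is unchanged by circular shifts, so this gap is tiny.
    gap = max(abs(B[i][j] - B_shifted[i][j]) for i in range(6) for j in range(6))
    print(gap)
```

The shift invariance shown in the usage lines is one reason the bispectrum is attractive as a measurement, and also one reason its inversion is only possible up to such ambiguities.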
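In this paper, we present a perception-action-communication loop design using Vision-based Graph Aggregation and Inference (VGAI). This multi-agent decentralized learning-to-control framework maps raw visual observations to agent actions, aided by local communication among neighboring agents. Our framework is implemented by a cascade of a convolutional neural network and a graph neural network (CNN / GNN), addressing agent-level visual perception and feature learning, as well as swarm-level communication, local information aggregation, and agent action inference, respectively. By jointly training the CNN and GNN, image features and communication messages are learned in conjunction to better address the specific task. We use imitation learning to train the VGAI controller in an offline phase, relying on a centralized expert controller. This results in a learned VGAI controller that can be deployed in a distributed manner for online execution. Additionally, the controller exhibits good scaling properties, with training in smaller teams and application in larger teams. Through a multi-agent flocking application, we demonstrate that VGAI yields performance comparable to or better than other decentralized controllers, using only the visual input modality and without access to precise location or motion state information.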
Machine learning (ML) has found broad applicability in quantum information science in topics as diverse as experimental design, state classification, and even studies on quantum foundations. Here, we experimentally realize an approach for defining custom prior distributions that are automatically tuned using ML for use with Bayesian quantum state estimation methods. Previously, researchers have looked to Bayesian quantum state tomography due to its unique advantages like natural uncertainty quantification, the return of reliable estimates under any measurement condition, and minimal mean-squared error. However, practical challenges related to long computation times and conceptual issues concerning how to incorporate prior knowledge most suitably can overshadow these benefits. Using both simulated and experimental measurement results, we demonstrate that ML-defined prior distributions reduce net convergence times and provide a natural way to incorporate both implicit and explicit information directly into the prior distribution. These results constitute a promising path toward practical implementations of Bayesian quantum state tomography.
Generalizability of time series forecasting models depends on the quality of model selection. Temporal cross validation (TCV) is a standard technique to perform model selection in forecasting tasks. TCV sequentially partitions the training time series into train and validation windows, and performs hyperparameter optimization (HPO) of the forecast model to select the model with the best validation performance. Model selection with TCV often leads to poor test performance when the test data distribution differs from that of the validation data. We propose a novel model selection method, H-Pro, that exploits the data hierarchy often associated with a time series dataset. Generally, the aggregated data at the higher levels of the hierarchy show better predictability and more consistency compared to the bottom-level data, which is more sparse and (sometimes) intermittent. H-Pro performs the HPO of the lowest-level student model based on the test proxy forecasts obtained from a set of teacher models at higher levels in the hierarchy. The consistency of the teachers' proxy forecasts helps select better student models at the lowest level. We perform extensive empirical studies on multiple datasets to validate the efficacy of the proposed method. H-Pro, along with off-the-shelf forecasting models, outperforms existing state-of-the-art forecasting methods, including the winning models of the M5 point-forecasting competition.
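The sequential partitioning that TCV performs can be sketched as below. This is one common variant (an expanding training window with a fixed validation horizon), assumed here for illustration; the abstract does not specify which windowing scheme is used, and the function name and parameters are hypothetical.

```python
def temporal_cv_splits(n_obs, n_folds, horizon):
    """Return (train_indices, val_indices) pairs in chronological order.

    Each validation window covers `horizon` consecutive time steps, and the
    training window is everything strictly before it (expanding window).
    """
    splits = []
    for fold in range(n_folds):
        val_end = n_obs - (n_folds - 1 - fold) * horizon
        val_start = val_end - horizon
        if val_start <= 0:
            raise ValueError("series too short for the requested folds")
        train = list(range(0, val_start))
        val = list(range(val_start, val_end))
        splits.append((train, val))
    return splits


if __name__ == "__main__":
    for train, val in temporal_cv_splits(n_obs=10, n_folds=3, horizon=2):
        print(train, val)
```

Hyperparameter optimization would then score each candidate configuration on the validation windows only, which is why the selected model can transfer poorly when the test period behaves differently from those windows.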
Identifying spurious correlations learned by a trained model is at the core of refining a trained model and building a trustworthy model. We present a simple method to identify spurious correlations that have been learned by a model trained for image classification problems. We apply image-level perturbations and monitor changes in certainties of predictions made using the trained model. We demonstrate this approach using an image classification dataset that contains images with synthetically generated spurious regions and show that the trained model was overdependent on spurious regions. Moreover, we remove the learned spurious correlations with an explanation-based learning approach.
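A perturb-and-monitor procedure of the kind described above might look like the following sketch. It is an assumption-laden illustration rather than the authors' method: the patch-occlusion perturbation, the `confidence_drops` helper, and the toy `predict` function that keys on a single corner pixel are all hypothetical.

```python
def confidence_drops(image, predict, patch=2, fill=0.0):
    """Occlude one patch at a time and record the drop in model confidence.

    `image` is a 2D list of floats and `predict` maps an image to a
    confidence in [0, 1]. Large drops point to regions the model relies on.
    """
    base = predict(image)
    rows, cols = len(image), len(image[0])
    drops = {}
    for r in range(0, rows, patch):
        for c in range(0, cols, patch):
            perturbed = [row[:] for row in image]  # copy before editing
            for rr in range(r, min(r + patch, rows)):
                for cc in range(c, min(c + patch, cols)):
                    perturbed[rr][cc] = fill
            drops[(r, c)] = base - predict(perturbed)
    return drops


if __name__ == "__main__":
    # Toy image with a bright marker in the top-left corner.
    image = [[0.5] * 4 for _ in range(4)]
    image[0][0] = 1.0

    def predict(img):
        # Toy model that has latched onto the corner marker.
        return 0.9 if img[0][0] > 0.8 else 0.5

    drops = confidence_drops(image, predict)
    print(max(drops, key=drops.get))
```

If the patch with the largest drop lies outside the object of interest, as the corner marker does in this toy case, that is a hint the model depends on a spurious region.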
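Explanatory Interactive Learning (XIL) collects user feedback on visual model explanations to implement a human-in-the-loop interactive learning scenario. Different user feedback types will have different impacts on user experience and on the cost associated with collecting feedback, since different feedback types involve different levels of image annotation. Although XIL has been used to improve classification performance in multiple domains, the impact of different user feedback types on model performance and explanation accuracy is not well studied. To guide future XIL work, we compare the effectiveness of two different user feedback types in image classification tasks: (1) instructing an algorithm to ignore certain spurious image features, and (2) instructing an algorithm to focus on certain valid image features. We use explanations from a Gradient-weighted Class Activation Mapping (GradCAM) based XIL model to support both feedback types. We show that identifying and annotating spurious image features that a model finds salient results in better classification and explanation accuracy than user feedback that tells a model to focus on valid image features.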